from: http://www.stat.pitt.edu/stoffer/tsa4/tsgraphics.htm
Here are some tips for plotting time series in R using base graphics, ggplot2, and ggfortify (which can be used for quick gg-plotting). I’ll do some comparisons here. Graphics using ggplot2 has become sort of a religion among hipsteR neRds, but you don’t have to become a convert to use it, nor a heretic if you don’t … it’s a nice tool - just don’t fall into the hype. Sometimes a line is just a line. — Anonymous.
The examples use astsa for the data, and ggplot2 and ggfortify for the graphics.
library(astsa)
library(ggplot2)
library(ggfortify)
The graphs are saved to a png file in the working directory. Skip the lines png(…) and dev.off() if you just want to test the result.
First, here’s a plot of gtemp using the base graphics. If you add a grid after you plot, it goes on top. You have some work to do if you want the grid underneath… but at least you can work around that - read on.
# for a basic plot, all you need is
plot(gtemp) # it can't get simpler than that (not shown)
plot(gtemp, type='o', col=4) # a slightly nicer version (not shown)
# but here's a pretty version that includes a grid
#png(file='gtemp.png', width=600, height=320)
par(mar=c(2,2,0,0)+.5, mgp=c(1.6,.6,0)) # trim the margins
plot(gtemp, ylab='Temperature Deviations', type='n') # set up the plot
grid(lty=1, col=gray(.9)) # add a grid
lines(gtemp, type='o', col=4) # and now plot the line
#dev.off()
And here’s a version that looks like a gg-plot:
#png(file='ggtemp.png', width=600, height=320)
par(mar=c(2,2,0,0)+.5, mgp=c(1.6,.4,0), tcl=-.2, las=1, cex.axis=.9)
plot(gtemp, ylab='Temperature Deviations', type='n')
rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col=gray(.9,.9), border='white')
grid(lty=1, col='white')
lines(gtemp, type='o', col=4)
# dev.off()
And a ggplot2 of the gtemp series. ggplot2 doesn’t play with time series so you have to create a data frame that strips the series to its bare naked data.
#png(file='gtemp1.png', width=600, height=320)
gtemp.df = data.frame(Time=c(time(gtemp)), gtemp=c(gtemp)) # strip the ts attributes
ptemp = ggplot(data = gtemp.df, aes(x=Time, y=gtemp) ) + # store it
ylab('Temperature Deviations') +
geom_line(col="blue") +
geom_point(col="blue", pch=1)
ptemp # plot it
#dev.off()
ptemp + theme_bw() # not shown
Here’s a similar plot using ggfortify, which plays well with time series. If you don’t want the points, it’s a one-liner (like a duck walks into a bar, orders a drink and says put it on my bill). Also note that the theme_bw() is added.
#png(file='gtemp2.png', width=600, height=320)
autoplot(gtemp, ylab='Temperature Deviations', xlab='Time', colour=4) + # this is enough for a quick plot
geom_point(aes(x=time(gtemp), y=gtemp), color=4) + # all this just to add points
theme_bw() # and this to change the theme
#dev.off()
â—Š You can see a major difference between gg-plotting and base-plotting. In base graphics, you do things one at a time. If I input plot(x) and then grid(), the graphic is developed according to the order entered. Thus the graphic is produced with the grid on top of the line. With gg-plotting, nothing is done until after all the input is finished. You can even save the results in an object and add instructions later.
Here’s an example of ggplot for two time series, one at a time (not the best way for many time series).
#png(file='gtemps1.png', width=600, height=320)
gtemp.df = data.frame(Time=c(time(gtemp)), gtemp=c(gtemp), gtemp2=c(gtemp2))
ggplot(data = gtemp.df, aes(x=Time, y=value, color=variable ) ) +
ylab('Temperature Deviations') +
geom_line(aes(y=gtemp , col='gtemp'), size=1, alpha=.5) +
geom_line(aes(y=gtemp2, col='gtemp2'), size=1, alpha=.5)
#dev.off()
A similar plot using ggfortify.
# png(file='gtemps2.png', width=600, height=320)
autoplot(cbind(Land_Marine=gtemp, Land_Only=gtemp2), ylab=expression(degree~C), xlab="Time",
size=1, alpha=.5, facets = FALSE) +
theme(legend.position=c(.1,.85)) # if left off, the plot is like the previous one
# dev.off()
And a similar plot using base graphics (and cotton candy colors).
# png(file='gtempsbase.png', width=600, height=320)
par(mar=c(2,2,0,0)+.5, mgp=c(1.6,.6,0))
ts.plot(gtemp, gtemp2, ylab="Gloabal Temp Devs", xlab="Time", main='', type='n')
grid(lty=1, col=gray(.9))
lines(gtemp, lwd=2, col = rgb(.9, 0, .7, .5) )
lines(gtemp2, lwd=2, col = rgb( 0, .7, .9, .5) )
legend('topleft', col=c(rgb(.9, 0, .7), rgb(0, .7, .9)), lty=1, lwd=2,
legend=c("Land/Marine", "Land Only"), bg='white')
#dev.off()
Both temperature series, separately, using ggfortify. Here, facets = TRUE by default. An example of plotting multiple series separately using ggplot is below for the explosions.
# png(file='gtemps3.png', width=600, height=320)
autoplot(cbind(gtemp, gtemp2), ylab="It's Getting Hot in Here", xlab="emiT", size=1, colour='blue')
#dev.off()
And another example using three different types of series from the L.A. Pollution Study. The scales are different, mortality is a count, temperature is in ℉, and particulates is in PPM.
# png(file='cmort1.png', width=600)
autoplot(cbind(Mortality=cmort, Temperature=tempr, Particulates=part), xlab='Time', colour='blue') +
guides(colour=FALSE) # shuts off legend
#dev.off()
Just don’t ask me how to get the y-labels on the three different series without just plotting three separate graphs. I’m sure it can be done …. Update: I found how to plot differently scaled multiple time series with ggplot2 on GitHub. I’m not sure it’s worth the trouble.
A similar quick plot using base graphics (not shown):
plot(cbind(cmort, tempr, part), col=4, main='Hot and Smoggy')
Here’s a pretty version in base graphics. The code is a little long, but repetitive, so you can copy-and-paste to save your wrists a little anguish. This example shows the BEST thing about base graphics… you can put anything anywhere.
# png(file='mtp_base.png', width=600)
par(mfrow=c(3,1), mar=c(0,3.5,0,3), oma=c(3.5,0,2,0), mgp=c(2,.6,0), cex.lab=1.1, tcl=-.3, las=1)
plot(cmort, ylab=expression(M[~t]~~~~(Number~Buried)), xaxt="no", type='n')
grid(lty=1, col=gray(.9))
lines(cmort, col=rgb(0,0,.9))
text(1974, 132, 'Bad Year', col=rgb(.5,0,.5), cex=1.25) # just for fun
arrows(1973.5, 130, 1973, 127, length=0.05, angle=30, col=rgb(.5,0,.5))
plot(tempr, ylab=expression(T[~t]~~~~(degree~F)), xaxt="no", yaxt='no', type='n')
grid(lty=1, col=gray(.9))
lines(tempr, col=rgb(.9,0,.9))
axis(4)
plot(part, ylab=expression(P[~t]~~~~(PPM)))
grid(lty=1, col=gray(.9))
lines(part, col=rgb(0,.7,0))
title(xlab="Time (week)", outer=TRUE)
# dev.off()
Maybe this is the answer to: how to plot differently scaled multiple time series with ggplot2 … do it in base graphics, dingus!
This is a ggfortify quick plot of all the series on the same graph (makes sense if the scales are at least similar, which they are in this case - if temperature were in ℃, you would have to convert it to ℉ for the plot to be useful).
#png(file='cmort2.png', width=600)
autoplot(cbind(Mortality=cmort, Temperature=tempr, Particulates=part), xlab='Time', facets=FALSE)
#dev.off()
# a similar quick plot in base graphics (not shown)
ts.plot(cmort, tempr, part, col=2:4) # needs a legend
legend('topright', legend=c('Mortality', 'Temperature', 'Pollution'), lty=1, col=2:4)
All 8 explosion series, in exploding color, using ggplot.
#png(file='eqexp.png', width=600 )
library(reshape) # install 'reshape' if you don't have it
df = melt(eqexp[,9:16]) # reshape the data frame
## Using as id variables
Time = rep(1:2048, 8)
ggplot(data=df, aes(x=Time, y=value, col=variable)) +
geom_line( ) +
guides(colour=FALSE) +
facet_wrap(~variable, ncol=2, scales='free_y') +
ylab('')
#dev.off()
# using ggfortify - if you don't want color (not shown)
autoplot(ts(eqexp[,9:16]), ncol=2, xlab='Time')
# and in base graphics (not shown)
plot(ts(eqexp[,9:16]), main='')
###################################################################
# NOTE: The code below using ggfortify used to work - but now it
# doesn't... but it may tomorrow ... the pain of relying on
# packages rears its ugly head
###################################################################
autoplot(ts(eqexp[,9:16]), ncol=2, xlab='Time', colour='blue') +
guides(colour=FALSE) # shuts off legend
Next is a more difficult problem where ggfortify is gg-frightful, but ggplot is gg-ok.
Ok, let’s make it hard… M.I.S.S.I.N.G. D.A.T.A. We’ll use blood, which is a multiple time series file with lots of NAs.
# In base graphics, it is sooooooo simple and the result is decent
#png(file='blood0.png', width=600 )
plot(blood, type='o', pch=19, main='')
#dev.off()
# but let's make it pretty in base
# png(file='blood.png', width=600, height=500 )
# I'm putting a little ↓ space between the plots
par(mfrow=c(3,1), mar=c(.2,3.5,0,2), oma=c(3,0,2,0), mgp=c(2,.6,0), cex.lab=1.5, las=1, bg=gray(.95))
plot(blood[,1], ylab='WBC', xaxt="no", type='n')
rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col=gray(.9,.9), border=8)
grid(lty=1, col=gray(1))
lines(blood[,1], type='o', pch=19, col=rgb(30,170,217,max=255))
plot(blood[,2], ylab='PLT', xaxt="no", type='n')
rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col=gray(.9,.9), border=8)
grid(lty=1, col=gray(1))
lines(blood[,2], type='o', pch=19, col=rgb(170,30,217,max=255))
plot(blood[,3], ylab='HCT', type='n')
rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col=gray(.9,.9), border=8)
grid(lty=1, col=gray(1))
lines(blood[,3], type='o', pch=19, col=rgb(217,78,30,max=255))
title(xlab="Time", outer=TRUE)
title("All Your Base Are Belong to Us, Dingus", outer=TRUE, cex.main=1.5)
# dev.off()
That wasn’t so bad… it’s basically the same code repeated 3 times. I got the colors from color picker.
I gave up trying to do this in ggfortify. It seems like ggfortify works great if your needs are simple, otherwise, you’re screwed. So, here it is with ggplot2. It works ok, but you get warnings and other frustrations you’ll see along the way…
# png(file='bloodgg.png', width=600 )
df = data.frame(day=c(time(blood)), blood=c(blood), Type=rep(c('WBC','PLT','HTC'), each=91) )
# df # uncomment to see the data frame
levels(df$Type) # notice that the factor levels of Type are in alphabetical order...
## [1] "HTC" "PLT" "WBC"
# [1] "HTC" "PLT" "WBC"
# ... if I don't use the next line, the plot will be in alphabetical order ...
# ... if I wanted the series in alphabetical order ...
# ... I would have ordered it that way - so I need ...
# ... the next line to reorder them back to the way I entered the data ...
df$Type = factor(df$Type, levels(df$Type)[3:1])
# any resemblance to the blood work of actual persons, living or dead, is purely coincidental
ggplot(data=df, aes(x=day, y=blood, col=Type)) +
ylab("Mary Jane's Blood Work") +
geom_line() +
geom_point() +
guides(colour=FALSE) +
facet_wrap(~Type, ncol=1, scales='free_y')
## Warning: Removed 9 rows containing missing values (geom_path).
## Warning: Removed 111 rows containing missing values (geom_point).
# Danger, Will Robinson! Warning! Warning! NAs appearing!
# Warning messages:
# 1: Removed 9 rows containing missing values (geom_path).
# 2: Removed 111 rows containing missing values (geom_point).
# We're doomed! Crepes suzette!
#dev.off()
We got the graphic and survived the aliens, too.
Here’s a ribbon plot of the Southern Oscillation Index using ggfortify because it’s an option. I’m adding some bells and whistles using ggplot, but using ggfortify allows me to pass instructions without removing the time series attributes of the data.
# this is a basic one-liner ribbon plot (not shown)
autoplot(soi, ts.geom='ribbon', ts.fill='blue', xlab='Time', ylab="SOI")
# but if you want to get fancy, you have to know some ggplot two, too
#png(file='soi.png', width=600)
d = ifelse(soi<0,0,1)
d1 = d*soi
d2 = (1-d)*soi
autoplot(soi, ts.geom='ribbon', xlab='Time', ylab="SOI") +
geom_ribbon(aes(ymax=d1, ymin=0, fill = "cold")) +
geom_ribbon(aes(ymax=0, ymin=d2, fill = "hot")) +
scale_fill_manual(name='SST', values=c("cold"="blue","hot"="red")) +
theme(legend.position=c(.05,.1))
#dev.off()
Well that might be pretty, but it obscures the trend, don’t you think?
Let’s try a nicer plot using ggplot2
# png(file='soi_nicer.png', width=600, height=320)
df = data.frame(Time=c(time(soi)), SOI=c(soi))
ggplot( data=df, aes(x=Time, y=SOI) ) +
geom_line(col=rgb(0, 0,.9, alpha=.4)) +
stat_smooth(span=1/12, col=6, se=FALSE) + # El Niño
stat_smooth(col=rgb(.7, 0, .7)) # La Tendencia (with CIs)
## `geom_smooth()` using method = 'loess'
## `geom_smooth()` using method = 'loess'
# dev.off()
And a similar plot in base graphics (lowess retains the time axis whereas loess does not)
#png(file="soi_nicer_base.png", width=600, height=320)
par(mar=c(2,2,0,0)+.5, mgp=c(1.6,.6,0))
plot(soi, type='n')
grid(lty=1, col=gray(.9))
lines(soi, col=rgb(0, 0,.9, alpha=.4))
lines(lowess(soi, f=.05), lwd=2, col=6) # El Niño cycle
lines(lowess(soi), lwd=2, col=rgb(.7, 0, .7)) # trend (with default span)
# dev.off()
If you need the intervals (the gray swatch) in base, it’s a bit of a pain. You have to use loess and then predict.loess to get the intervals. But those functions strip the time series attributes so you have to adjust for that….
# Here's how to get a CI, but I'll only do the trend
#png(file="soi_base_ci.png", width=600, height=320)
par(mar=c(2,2,0,0)+.5, mgp=c(1.6,.6,0))
plot(soi, type='n', ylab='SOI')
grid(lty=1, col=gray(.9))
lines(soi, col = rgb(0,0,.9, .4))
lines(lowess(soi, f=.05), lwd=2, col=6) # El Niño cycle
# code above is the same as the previous example...
# ... and trend with CIs from here down
x = c(time(soi)) # remove ts attributes...
y = c(soi) # ... you do it or loess does it for you
lo = predict(loess(y ~ x), se=TRUE) # trend in lo$fit
lines(x, lo$fit, col=rgb(.7,0,.7), lwd=2)
L = lo$fit - qt(0.975, lo$df)*lo$se
U = lo$fit + qt(0.975, lo$df)*lo$se
xx = c(x, rev(x))
yy = c(L, rev(U))
polygon(xx, yy, border=8, col=gray(.6, alpha=.3) )
#dev.off()
This is a plot of S&P 500 returns. The data are in astsa, but it’s an xts file, so for ggfortify, you have to load that first.
library(xts)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
#png(file='sp500w.png', width=600, height=320)
autoplot(sp500w, colour=rgb(0,.6,.6), xlab='Time', ylab="Weekly Returns") +
theme_bw()
#dev.off()
# in 'xts' graphics, you would of course load 'xts' and then
# no color (not shown)
plot(sp500w) # the grid is on top of the line = UGLY (or line <- UGLY for the neRds)
# all dressed up
#png(file='sp500w_xts.png', width=600, height=320)
par(mar=c(2,2,1,0)+1, mgp=c(1.6,.6,0))
plot(sp500w, type='n', ylab="Weekly Returns", main='S&P 500')
lines(sp500w, col=rgb(0,.6,.6))
#dev.off()
Using ggfortify, here’s a discrete-valued series plotted as a step function. EQcount is a counter of certain types of earthquakes in case you miss that.
#png(file='EQcount.png', width=600, height=320)
autoplot(EQcount, ts.geom='point', ts.shape=21, colour='blue', size=2, fill='magenta',
main='Shakin all Over', xlab='Year' ) +
geom_step(x=time(EQcount), y=EQcount, colour='blue') +
theme_bw()
#dev.off()
# and a similar plot using base graphics
# png(file='EQcount_basie.png', width=600, height=320)
par(mar=c(2, 2, 0, 0)+.5, mgp=c(1.6,.6,0))
plot(EQcount, type='n')
grid(lty=1, col=gray(.9))
points(EQcount, pch=21, col=4, cex=1.1, bg = 6) # looks better without this
lines(EQcount, type='s', col=4)
#dev.off()
ggfortify has a bar option (did I mention the duck that went into a bar, ordered a drink and said put it on my bill?)
#png(file='EQcount2.png', width=600, height=320)
autoplot(EQcount, ts.geom='bar', colour=rgb(0,0,.9, alpha=.25), xlab='Year', ylab='Number of BFEs')
# dev.off()
# the base version would be
# png(file='EQcount2_basie.png', width=600, height=320)
par(mar=c(2, 2, 0, 0)+.5, mgp=c(1.6,.6,0))
plot(EQcount, type='n')
grid(lty=1, col=gray(.9))
lines(EQcount, type='h', col=rgb(0,0,.9, alpha=.5), lwd=2)
#dev.off()
If you did not know this already, with time series, the dimensions of the plot matters. Consider these two plots of the bi-annual sunspot numbers. In the first plot, you see that the series rises quickly ↑ and falls slowly ↘ . The second plot obscures this fact.
#png(file='sun1.png', width=600, height=200) # wide and narrow
autoplot(sunspotz, xlab='Time')
#dev.off()
#png(file='sun2.png') # default square (480 px)
autoplot(sunspotz, xlab='Time')
#dev.off()
It is my understanding that R always defaults to a square, no matter which device is called. While squares are ok for toilet paper, they are typically not ok for plotting time series.
And finally, a base graphics plot of the sunspot numbers:
# png(file='sun_cc.png', width=600, height=400)
x = sunspotz
culer1 = rgb(242, 153, 216, max=255)
culer2 = rgb(208, 73, 242, max=255)
culer3 = rgb( 77, 161, 249, max=255)
culer4 = rgb( 0, 200, 225, max=255)
culer5 = rgb(124, 231, 251, max=255)
par(mar=c(2,2,1,1)+2, mgp=c(3,.2,0), las=1, cex.main=2, tcl=0, col.axis=culer1, bg=rgb(.25,.1,.25))
plot(x, type='n', main='', ylab='', xlab='')
rect(par("usr")[1], par("usr")[3], par("usr")[2], par("usr")[4], col='black')
grid(lty=1, col=rgb(1,0,1, alpha=.5))
lines(x, lwd=3, col=culer1)
lines(window(x, start=1800), lwd=3, col=culer2)
lines(window(x, start=1850), lwd=3, col=culer3)
lines(window(x, start=1900), lwd=3, col=culer4)
lines(window(x, start=1950), lwd=3, col=culer5)
title(expression('Psychedelic' * phantom(' Sunspots')), col.main=culer1)
title(expression(phantom('Psychedelic') * ' Sunspots'), col.main=culer5)
mtext('Time', side=1, line=2, col=culer3, font=2, cex=1.25)
mtext('Sunspot Numbers', side=2, line=2, col=culer2, font=2, las=0, cex=1.25)
text(1800, 180, "don't stare at the sunspots", col=culer5, srt=20, font=4)
text(1900, 170, "s.t.a.y c.o.o.l", col=culer1, srt=330, font=4)
text(1850, 160, "dave? dave? \n dave's not here!", col=culer3, font=4)
#dev.off()